Abstract

Identifying the genetic and ecological basis of adaptation is of immense importance in 
evolutionary biology. In our study, we applied a panel of 58 biallelic single nucleotide 
polymorphisms (SNPs) for the economically and culturally important salmonid Oncorhynchus 
keta. Samples included 4164 individuals from 43 populations ranging from coastal western
Alaska to southern British Columbia and northern Washington. Signatures of natural selection
were detected by identifying seven outlier loci using two independent approaches: one based on 
outlier detection and another based on environmental correlations. Evidence of divergent 
selection at two candidate SNP loci, Oke_RFC2-168 and Oke_MARCKS-362, indicates 
significant environmental correlations, particularly with the number of frost-free days (NFFD). 
Important associations found between environmental variables and outlier loci indicate that those 
environmental variables could be the major driving forces of allele frequency divergence at the 
candidate loci. NFFD, in particular, may play an important adaptive role in shaping genetic 
variation in O. keta. Correlations between divergent selection and local environmental variables 
will help shed light on processes of natural selection and molecular adaptation to local 
environmental conditions. 

Introduction

Chum salmon, Oncorhynchus keta, has the largest range among all Pacific salmon species in the
North Pacific Rim (Sato et al. 2004), and it constitutes an important part of the Pacific Rim
ecosystem (Seeb et al. 2011). All Pacific salmon species are anadromous, although some
species, particularly sockeye and masu salmon, also have nonanadromous populations (Quinn
2005). They hatch in fresh water, then migrate to sea where they spend most of their lives, and 
return to fresh water to spawn at maturity. Pacific salmon have a strong tendency to return from 
ocean feeding areas to their natal streams to spawn, and this strong homing behavior can result in 
reproductive isolation of populations (Quinn 2005). 

Pacific salmon vary greatly in age and size at maturity, morphology, and timing of spawning 
(Beacham and Murray 1987). Differences in spawning time and location result in distinct salmon 
stocks. Stocks might be genetically different due to their adaptations to their particular spawning 
locations; different spawning, incubation, and rearing environments may thus cause their 
adaptive genetic differences (Beacham and Murray 1987). 

Patterns of environmental variation can shape adaptive genetic variation across a species' 
range. Environmental heterogeneity subjects populations to varying selective pressures, and this
may lead to local adaptation (Schoville et al. 2012). Natural selection can play a major
diversifying role in salmonid populations, and environmental variables such as water temperature,
stream size, female choice, and predation risk have been found to be among the most influential
agents of natural selection in this system (Garcia de Leaniz et al. 2007). Thus, correlations
between genetic variation and environmental gradients can be interpreted as evidence of natural
selection by uncovering loci that are linked to selected genes (Eckert et al. 2010).

Genome-wide scans of patterns of single nucleotide polymorphisms (SNPs) can reveal 
patterns of adaptive genetic variation by detecting locus-specific signatures of positive selection. 
An alternative way of uncovering signs of local adaptation is by identifying significant
correlations between genetic polymorphisms and environmental variables (Coop et al. 2010).
Nonetheless, associations found between environmental variables and adaptive genetic
divergence do not necessarily indicate causal relationships, but they can provide insights into the
selective forces potentially generated by environmental variables in natural selection (McColl and
McKechnie 1999).

The purpose of this study was to detect signatures of local adaptation in O. keta by 
investigating the correlations between patterns of allele frequency differentiation and 
environmental gradients. We hypothesize that given the wide range of environmental conditions 
encountered by O. keta, there are corresponding adaptive genetic differences among the salmon 
populations. We apply a panel of 58 SNPs to 43 populations of chum salmon ranging from
coastal western Alaska to southern British Columbia and northern Washington. We use this
SNP data set to detect signatures of natural selection in O. keta by identifying SNP outlier loci
in terms of the genetic differentiation index Fst. In addition, we examine the correlations between
allele frequencies at outlier loci and environmental variables.

Methods 

Data collection 

In the present study, we obtained a published data set (Seeb et al. 2011) available from the online
repository DRYAD (www.datadryad.org). We selected a subset of the representative populations
distributed throughout the Pacific Northwest (Fig. 1). As a result, a total of 58 biallelic single
nucleotide polymorphisms (SNPs) genotyped in 4164 individuals from 43 different populations
of chum salmon were studied. The samples came from regions ranging from coastal western
Alaska to southern British Columbia and northern Washington (Fig. 1). Additionally, we
collected sixty monthly, seasonal and annual environmental variables related to temperature,
precipitation and topography for each sampling location using ClimateWNA
(http://www.genetics.forestry.ubc.ca/cfcg/ClimateWNA/ClimateWNA.html; Wang et al. 2012).

Figure 1. Map of sampling locations for 43 populations of chum salmon. Blue symbols indicate
populations from the northern lineage; red symbols indicate populations from the southern 
lineage as identified using STRUCTURE. 

Genetic variation

All SNP markers were previously tested for conformity to Hardy-Weinberg (HW) equilibrium by
Seeb et al. (2011), and four linked SNP pairs were found to display patterns of significant 
gametic disequilibrium. Since linkage was only observed within a handful of populations, we 
retained all SNP markers for further analyses. 

Environmental variables and PCA 

A principal component analysis (PCA) using the statistical software R (package ade4) was
applied to all two hundred and sixteen environmental variables to examine possible correlations
among them. Variables that were correlated at |r| > 0.8 were considered
redundant and were thus removed (Manel 2010). Within each pair of highly correlated
variables, we kept only one biologically relevant variable. Therefore, only environmental
variables identified as being uncorrelated by the PCA were used.
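
The redundancy filter described above can be sketched as follows. This is a minimal illustration in plain Python, not the R/ade4 workflow used in the study. The function names `pearson` and `drop_correlated`, the greedy keep-first rule, and the toy values are assumptions made for illustration only; in the study, the retained member of each correlated pair was chosen for biological relevance, which a greedy rule can only approximate by listing preferred variables first.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def drop_correlated(columns, threshold=0.8):
    """Greedily keep variables whose |r| with every kept variable is <= threshold.

    columns: dict mapping a variable name to its per-site values.
    Variables are visited in insertion order, so list preferred ones first.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Toy values (hypothetical, not study data): "temp_copy" is a linear
# transform of "temp_like"; "other" alternates and is only weakly correlated.
temp_like = [float(i) for i in range(10)]
temp_copy = [2.0 * v + 1.0 for v in temp_like]
other = [1.0 if i % 2 == 0 else -1.0 for i in range(10)]
toy = {"temp_like": temp_like, "temp_copy": temp_copy, "other": other}
retained = drop_correlated(toy)
print(retained)
```

Because the rule is order-dependent, reordering the input dictionary changes which member of a correlated pair survives; that is the hook for encoding a preference for the more biologically relevant variable.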

Population structure 

The extent of population differentiation was quantified by calculating pairwise Fst for each 
locus and over all loci among all 43 chum salmon populations using Arlequin 3.5 (Excoffier et al. 
2009). In addition, we used the software STRUCTURE 3.4 (Pritchard et al. 2000) to detect 
clusters (K) of individuals. We ran 10 chains of 100,000 iterations with K ranging from 1 to 10. 
We then ran the structure program with the exclusion of the outlier loci to investigate if the 
outliers had an effect on the population structure. We used STRUCTURE HARVESTER Web
v0.6.92 (Earl and vonHoldt 2012) to estimate Delta K (ΔK) and ln P(K), the natural log
probability of K.
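
As a rough illustration of how ΔK can be derived from replicate STRUCTURE runs, the sketch below uses the common formulation ΔK = |mean L(K+1) - 2 mean L(K) + mean L(K-1)| / sd L(K), where L(K) is ln P(K). The function name `delta_k` and the ln P(K) values are hypothetical and are not taken from the study; STRUCTURE HARVESTER's own implementation may differ in detail.

```python
import statistics

def delta_k(lnp_by_k):
    """Compute Delta K for each K that has both neighbours K-1 and K+1.

    lnp_by_k: dict mapping K to a list of ln P(K) values from replicate runs.
    """
    mean = {k: statistics.mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in mean and k + 1 in mean:
            sd = statistics.stdev(lnp_by_k[k])  # sample sd across replicates
            if sd > 0:  # Delta K is undefined when replicates are identical
                result[k] = abs(mean[k + 1] - 2 * mean[k] + mean[k - 1]) / sd
    return result

# Hypothetical ln P(K) values for three replicate runs per K (not study data).
runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-800.0, -801.0, -799.0],
    3: [-790.0, -795.0, -785.0],
    4: [-785.0, -792.0, -778.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(best_k, dk)
```

ΔK cannot be evaluated for the smallest or largest K tested, so a range of K from 1 to 10 yields ΔK values for K = 2 to 9 only.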

Signature of natural selection

Arlequin 3.5 (Excoffier et al. 2009) was used to detect outlier loci. We ran 100,000 simulations 
assuming 100 demes per group using a finite island model without taking into account the 
underlying population structure. We then ran the software again using the hierarchical island 
model by grouping populations into two groups representing the northern and the southern 
lineages based on the population clusters identified by STRUCTURE. Loci that fell above the
99% quantile were determined to be under positive selection, and loci that fell below the 1%
quantile were considered to be candidates for balancing selection.
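
The quantile rule above can be sketched in a few lines. This is a simplified illustration under stated assumptions: Arlequin derives its null distribution from coalescent simulations and evaluates each locus conditional on its heterozygosity, whereas the sketch compares every locus against one pooled null distribution. The function names, the evenly spaced stand-in for simulated null Fst values, and the locus labels are all hypothetical.

```python
def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an ascending list, with 0 <= q <= 1."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def classify_loci(observed_fst, null_fst, lower=0.01, upper=0.99):
    """Label each locus by where its Fst falls relative to the null quantiles."""
    null_sorted = sorted(null_fst)
    lo_cut = quantile(null_sorted, lower)
    hi_cut = quantile(null_sorted, upper)
    labels = {}
    for locus, fst in observed_fst.items():
        if fst > hi_cut:
            labels[locus] = "divergent"   # candidate for positive selection
        elif fst < lo_cut:
            labels[locus] = "balancing"   # candidate for balancing selection
        else:
            labels[locus] = "neutral"
    return labels

# Stand-in null distribution: 1001 evenly spaced values from 0 to 0.2.
# Real null values would come from coalescent simulations.
null_fst = [0.2 * i / 1000 for i in range(1001)]
observed = {"locus_A": 0.199, "locus_B": 0.05, "locus_C": 0.001}
labels = classify_loci(observed, null_fst)
print(labels)
```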

Environmental effects on adaptive genetic variation 

In order to reinforce evidence of natural selection acting on outlier loci, we applied an alternative 
approach that utilizes associations between environmental variables and patterns of allele 
frequencies in identifying genetic adaptive variation. Allele frequencies are typically correlated 
among geographically closer populations; as a result, the association found between 
environmental variables and allele frequencies could occur by chance due to isolation by distance 
or similar environmental factors acting on geographically proximate populations (Limborg et al. 
2011; Coop et al. 2010). If not taken into account, signals from neutral population structure
could lead to a high false positive rate (Coop et al. 2010). To overcome false positivity caused by 
neutral population structure, we accounted for spatial autocorrelation when testing for 
correlations between allele frequencies and environmental variables. Applying 150,000 iterations 
in Bayenv (Coop et al. 2010), we tested for correlations between the SNP allele frequencies and 
the following variables that were identified as being uncorrelated by the PCA: (1)
precipitation as snow (mm) (PAS), (2) degree-days above 5°C, growing degree-days (DD5), (3)
mean annual precipitation (mm) (MAP), (4) the number of frost-free days (NFFD), (5)
degree-days above 18°C, cooling degree-days (DD18), (6) mean warmest month temperature
(°C) (MWMT), (7) annual heat-moisture index (AHM), calculated as (MAT + 10)/(MAP/1000),
where MAT is the mean annual temperature (°C), (8) temperature difference between MWMT and
MCMT (mean coldest month temperature (°C)), or continentality (°C) (TD), (9) the Julian date
on which the frost-free period (FFP) begins (bFFP), (10) mean annual summer (May to Sept.)
precipitation (mm) (MSP), (11) summer heat-moisture index (SHM), calculated as
MWMT/(MSP/1000), (12) longitude, (13) latitude and (14) elevation.
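
The three derived climate variables in the list above (AHM, SHM and TD) follow directly from the formulas given in the text. The sketch below simply restates them in Python; the function names and the input values are hypothetical and do not correspond to any sampling location in the study.

```python
def annual_heat_moisture(mat_c, map_mm):
    """AHM = (MAT + 10) / (MAP / 1000); MAT in deg C, MAP in mm."""
    return (mat_c + 10.0) / (map_mm / 1000.0)

def summer_heat_moisture(mwmt_c, msp_mm):
    """SHM = MWMT / (MSP / 1000); MWMT in deg C, MSP in mm."""
    return mwmt_c / (msp_mm / 1000.0)

def continentality(mwmt_c, mcmt_c):
    """TD = MWMT - MCMT, both in deg C."""
    return mwmt_c - mcmt_c

# Hypothetical site: MAT 5 C, MAP 2000 mm, MWMT 15 C, MCMT -10 C, MSP 500 mm.
ahm = annual_heat_moisture(5.0, 2000.0)
shm = summer_heat_moisture(15.0, 500.0)
td = continentality(15.0, -10.0)
print(ahm, shm, td)
```

Higher AHM or SHM values indicate warmer, drier conditions, while a larger TD indicates a more continental climate.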

Results 

Genetic diversity 

Among all chum salmon populations, varying levels of differentiation (Fst) were revealed, with values
ranging from 0.0311 to 0.157 and an average over all SNP markers of 0.0564. Genetic
diversity was measured by average heterozygosity per locus; across all 43 populations, expected
heterozygosity (He) ranged from 0.224 to 0.299 and observed heterozygosity (Ho) from 0.227 to 0.304 (Table 1).
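
For a biallelic SNP, observed and expected heterozygosity can be computed from genotype counts as sketched below. This is an assumption-laden illustration rather than the procedure used by Seeb et al. (2011): it uses the uncorrected estimator He = 2p(1 - p), whereas population-genetics software may apply a small-sample correction. The function names and the genotype counts are hypothetical.

```python
def locus_heterozygosity(n_aa, n_ab, n_bb):
    """Return (Ho, He) for one biallelic locus from genotype counts."""
    n = n_aa + n_ab + n_bb
    ho = n_ab / n                      # observed share of heterozygotes
    p = (2 * n_aa + n_ab) / (2 * n)    # frequency of allele A
    he = 2 * p * (1 - p)               # Hardy-Weinberg expectation, uncorrected
    return ho, he

def mean_heterozygosity(genotype_counts):
    """Average Ho and He over loci; genotype_counts is a list of (AA, AB, BB)."""
    pairs = [locus_heterozygosity(*counts) for counts in genotype_counts]
    mean_ho = sum(ho for ho, _ in pairs) / len(pairs)
    mean_he = sum(he for _, he in pairs) / len(pairs)
    return mean_ho, mean_he

# Hypothetical genotype counts at two loci in one population (not study data).
counts = [(25, 50, 25), (81, 18, 1)]
mean_ho, mean_he = mean_heterozygosity(counts)
print(round(mean_ho, 3), round(mean_he, 3))
```

Both toy loci are in exact Hardy-Weinberg proportions, so Ho and He coincide; in real data the two differ slightly, as in Table 1.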

Table 1. Location of spawning populations, average observed heterozygosity (Ho) and average
expected heterozygosity (He) (Seeb et al. 2011).

Population    Mean Ho   Mean He   Geographic origin
CMKOB91       0.300     0.292     Kobuk River
CMNOA91       0.293     0.290     Noatak River
CMFISH04      0.300     0.292     Fish River
CMKWIN04      0.296     0.286     Kwiniuk River
CMNIUK04      0.296     0.293     Niukluk River
CMNOME05      0.292     0.298     Nome River
CMPIL94       0.282     0.289     Pilgrim River
CMSNA9395     0.287     0.284     Snake River
CMSOL9356     0.295     0.292     Solomon River
CMUNA9204     0.286     0.289     Unalakleet River
CMAND93W      0.301     0.288     Andreafsky River
CMOTTANV93    0.283     0.291     Anvik, Beaver/Otter Ck
CMGIS94       0.304     0.296     Gisasa River
CMYUKA93      0.290     0.288     Innoko River
CMNUL94       0.290     0.284     Nulato River
CMTOZI03      0.295     0.286     Tozitna River
CMMEL94       0.286     0.284     Melozitna River
CMCHE94       0.279     0.282     Chena River
CMHENS95      0.279     0.271     Henshaw Creek
CMSAL01       0.267     0.269     Salcha River
CMBCK95       0.276     0.274     Big Creek, Mainstem
CMPEL93       0.273     0.275     Pelly River
CMDON94       0.262     0.261     Donjek River (White)
CMFBR94       0.275     0.267     Fishing Branch (Porcupine)
CMBSAL01      0.271     0.273     Big Salt
CMSHE92       0.279     0.280     Sheenjek River (Porcupine)
CMBLU92       0.283     0.282     Bluff Cabin
CMDEL9294     0.287     0.284     Delta River
CMTAN93       0.278     0.276     Kantishna River
CMTOK94GS     0.283     0.284     Toklat River/Geiger
CMBRIA93      0.281     0.287     Whale Creek (Egegik)
CMBRIC93      0.284     0.282     Pumice Creek (Ugashik)
CMMES92       0.297     0.299     Meshik River
CMWESB93      0.277     0.280     Plenty Bear Creek (Meshik)
CMILNIK02     0.260     0.259     Ilnik River
CM24MI06      0.260     0.244     24 Mile - Chilkat River
CMDIPAC06     0.237     0.238     Macaulay Hatchery
CMHFHAT06     0.245     0.244     Hidden Falls Hatchery
CMTAKU06F     0.258     0.254     Taku River - fall
CMNARM06S     0.250     0.249     North Arm Creek
CMNISQ04      0.227     0.224     Nisqually River
CMELWH04      0.229     0.225     Elwha River

PCA of environmental variables

Fourteen environmental (climatic and topographic) variables were identified as being
uncorrelated by the PCA and were therefore retained (Fig. 2).

Figure 2. Representation of the fourteen retained environmental (climatic and topographic)
variables on a principal component plot.

Detection of loci under selection

The genome scan assuming a finite island model showed an excess of outlier loci (Fig. 3a).
Undetected population substructure can inflate false-positive rates in outlier tests for SNP loci
under selection (Huelsenbeck and Andolfatto 2007); we therefore took the hierarchical
population structure into account when running the Arlequin test. The number of outlier loci
was markedly reduced after accounting for a hierarchical structure based on
the population clusters (K=2) previously identified by STRUCTURE. Using the hierarchical
island model, a total of six outlier loci were detected with Arlequin. We detected three
significant outliers for divergent selection (P<0.01) at the Fst level (Fig. 3b), Oke_FARSLA-242,
Oke_u1-519 and Oke_RFC2-168, of which two, Oke_u1-519 and Oke_RFC2-168, were also
candidates at the Fct level (Fig. 3c). A fourth outlier, Oke_MARCKS-362, was detected only at the
Fct level as a potential candidate for divergent selection. Two loci that lay below the 1%
quantile, Oke_TCP1-78 and Oke_arf-319, were considered candidates for balancing selection
(Fig. 3b and 3c). Bayenv confirmed two of the six outliers (p < 0.01) found with
Arlequin, Oke_RFC2-168 and Oke_MARCKS-362, and detected a new outlier, Oke_Tf-278.

Figure 3. Outlier tests for detection of loci under selection using the method of Excoffier et al.
(2009). (a) Fst-based test plotted against heterozygosity under the assumptions of a finite island
model, (b) Fst-based test assuming a hierarchical island model by grouping all populations into
two major groups, (c) Fct-based test assuming a hierarchical island model as in (b). The red solid
lines represent the 1% quantiles from coalescent simulations; dashed lines indicate the 5% and
10% quantiles.

Environmental correlates 

Allele frequencies at two of the SNP loci identified by Arlequin as being under positive selection
(P<0.01), Oke_RFC2-168 and Oke_MARCKS-362, were significantly correlated with one or two
variables (Table 2). One new outlier, Oke_Tf-278, detected by Bayenv was found to be associated
with one environmental variable, the number of frost-free days (NFFD) (Table 2). The rest of the
outlier loci, Oke_FARSLA-242, Oke_TCP1-78, Oke_u1-519 and Oke_arf-319, were not
correlated with any environmental variable by Bayenv.






Downloaded from http://biorxiv.org/on September 18, 2014 



Table 2. Summary of seven outlier loci, their locus names and results from Bayesian inference
for correlation between allele frequencies and environmental variables.

Selection   Locus (reference)                        Locus name                                                        Correlated variables
Positive    Oke_MARCKS-362 (Elfstrom et al. 2007)    Myristoylated alanine-rich protein kinase substrate               Long
Positive    Oke_RFC2-168 (Smith et al. 2005b)        Replication factor                                                NFFD, TD
Positive    Oke_u1-519 (Smith et al. 2005a)          Unknown                                                           None
Positive    Oke_FARSLA-242 (Elfstrom et al. 2007)    Phenylalanine-tRNA synthetase-like, alpha subunit                 None
Balancing   Oke_TCP1-78 (Elfstrom et al. 2007)       Similar to T-complex protein 1, epsilon subunit (TCP-1-epsilon)   None
Balancing   Oke_arf-319 (Smith et al. 2005b)         Unknown                                                           None
N/A         Oke_Tf-278 (Elfstrom et al. 2007)        Transferrin                                                       NFFD



The Bayenv method revealed a strong correlation between allele frequencies at Oke_RFC2-168
and the number of frost-free days (NFFD). As the number of frost-free days increases, the allele
frequency decreases (Fig. 4). When NFFD exceeds 200
days, the allele frequency at Oke_RFC2-168 approaches zero. One point representing
a northern population stands apart because of its geographic proximity to the southern populations
and hence its similar number of frost-free days. However, it maintains a higher allele frequency due to
its northern lineage.

Figure 4. The correlation between allele frequencies at Oke_RFC2-168 and the number of
frost-free days (NFFD). Frequency of only a single allele is given for the SNP locus.

The allele frequency at Oke_RFC2-168 is also positively correlated with TD (the temperature 
difference between the mean warmest month temperature (°C) and the mean coldest month 
temperature (°C)). The allele frequency increases steadily from 0 to 0.7 as TD increases from 0 
°C to 45°C (Fig. 5). 






Figure 5. The correlation between allele frequencies at Oke_RFC2-168 and TD (the temperature 
difference between the mean warmest month temperature (°C) and the mean coldest month 
temperature (°C)). Frequency of only a single allele is given for the SNP locus. 



The allele frequency at Oke_MARCKS-362 varied with longitude. The allele frequency is 
relatively high when longitude is less than 140 degrees (southern populations), and it decreases 
substantially and stabilizes after longitude increases beyond 140 degrees (northern populations) 
(Fig. 6). 

Figure 6. The correlation between allele frequencies at Oke_MARCKS-362 and longitude (Long).
Frequency of only a single allele is given for the SNP locus. 

The allele frequency at Oke_Tf-278 is positively correlated with NFFD and it increases as
NFFD increases from 0 to 200 days (Fig. 7).

Figure 7. The correlation between allele frequencies at Oke_Tf-278 and the number of frost-free
days (NFFD). Frequency of only a single allele is given for the SNP locus.

Population structure 

A strong neutral population structure was revealed (Fig. 8). Cluster analysis with STRUCTURE
grouped populations into K=2 clusters. The two population
clusters were clearly distinct (Fig. 8a); after all six outliers found with Arlequin were removed, however, the
distinction between the two clusters was blurred (Fig. 8b).

Figure 8. Number of population clusters (K) detected by STRUCTURE program 2.0: (a) test
run including all 58 SNP loci, and (b) test run excluding all six outliers detected by
Arlequin.

Discussion

In our study, significant associations between environmental variables and several outlier loci 
were found. These loci are potentially important in adaptation to local environments,
and they might be under selection driven by the environmental variables. Locus Oke_Tf-278
(Transferrin; Elfstrom et al. 2007) was shown to be an outlier by both independent approaches 
we implemented, and its allele frequency was positively correlated with the number of frost-free 
days (NFFD). Oke_Tf-278 was also shown to be under positive selection at the 95% confidence 
level by Seeb et al. (2011) in their study undertaken throughout the species' range of O. keta. 
Evidence for positive selection at the transferrin gene among salmonid species was also found in 
other studies (Ford 2001). Transferrins are iron-binding proteins that are involved in iron storage 
and resistance to bacterial disease. Since iron is often a limiting nutrient for bacterial
growth, transferrin may provide resistance to bacterial infection by binding iron (Ford 2001).
Therefore, iron competition with salmonid pathogens could potentially act as a selective pressure
on the transferrin gene. It has been found in some salmon populations that a specific transferrin
genotype can confer resistance to bacterial infection (Ford 2001). Warmer temperatures have
been shown to increase the infection rate of fish pathogens (Richter and Kolmes 2005). In
salmonid species, populations that are already stressed by high water temperatures, manifested
by weight loss, disease and displacement by better-adapted species, are more susceptible to 
pathogen infections (Richter and Kolmes 2005). As a result, a positive correlation between the 
number of frost-free days (NFFD) and the allele frequency at Oke_Tf-278 may be explained by 
the presence of different intensities of bacterial infections in different environments with varying 
temperatures, with more infections in the south due to warmer temperatures. 

Oke_RFC2-168 was shown to be under positive selection and to correlate with two 
environmental variables: the number of frost-free days (NFFD) and the temperature difference
between the mean warmest month temperature (°C) and the mean coldest month temperature (°C)
(TD). Oke_RFC2-168 was also shown to be under positive selection at the 99% confidence level
by Seeb et al. (2011). Oke_RFC2-168 is situated in the gene that codes for replication factor C
(RFC), a five-subunit DNA polymerase accessory protein, which is involved in both DNA
replication and repair (Seeb et al. 2011). RFC is a protein complex that functions to facilitate the
assembly of active replication complexes (Bowman et al. 2004). RFC is essentially a clamp
loader for the sliding clamp, PCNA (proliferating cell nuclear antigen), and it can form a stable
ATP-dependent complex with the sliding clamp. The α-helices and the amino acid residues that
are crucial for DNA interaction are conserved in all five RFC subunits (Bowman et al. 2004). The
SNP locus Oke_RFC2-168 is identified as a synonymous substitution within the coding region of
the RFC2 gene (Seeb et al. 2011).

Overall, we observed a strong effect of environmental variables on the allele frequencies at
locus Oke_RFC2-168 (Fig. 4 and 5). The pattern suggests adaptation along environmental
gradients: the allele frequency at Oke_RFC2-168 is high where NFFD is low and TD is high
(characteristic of more northern, continental climates) and low where NFFD is high and TD is
low (characteristic of more southern and coastal climates). These findings
suggest that populations with a higher allele frequency at Oke_RFC2-168 could have an
adaptive advantage in extreme weather conditions where TD is high and NFFD is low. Likewise,
populations with a lower allele frequency at Oke_RFC2-168 might be best adapted to a
temperate climate where TD is low and NFFD is high. Several studies have suggested an
important role of temperature and photoperiod in determining salmonid migration patterns and the
physiological changes during smolt development (Sykes and Shrimpton 2010; Sykes et al. 2009).
Increases in photoperiod and temperature can stimulate physiological changes associated with
increased saltwater tolerance; temperature, along with turbidity and flow, triggers actual
migration. The timing and duration of the migration of salmonid smolts are crucial to their
survival in the marine environment (Sykes and Shrimpton 2010; Sykes et al. 2009). As a result, the
effect of temperature and photoperiod on salmonid smolting lends support to important
adaptive roles of both environmental variables, TD and NFFD. Yet, the SNP Oke_RFC2-168 was
identified as a synonymous substitution within the coding region of the RFC gene, so we argue that
Oke_RFC2-168 could be linked to a favorable allele at a closely linked SNP locus, not
examined in our study, in the coding region of the RFC gene.

Oke_MARCKS-362 was found to be under positive selection and is associated with the gene
that encodes myristoylated alanine-rich protein kinase C substrate (Elfstrom et al. 2007). The
gene product, myristoylated alanine-rich C kinase substrate (MARCKS),
is a substrate for protein kinase C and is thought to be involved in cell motility, phagocytosis,
membrane trafficking and mitogenesis (Dulong et al. 2004). We show that allele frequency at
Oke_MARCKS-362 correlates significantly with longitude. The correlation between the allele
frequency at Oke_MARCKS-362 and longitude is not straightforward and could be due to
confounding effects, as we considered only abiotic factors. Biotic factors such as
predation risk, female choice and pathogens can also act as selective agents in salmonid
populations (Garcia de Leaniz et al. 2007). Therefore, biotic factors could have imposed selective
pressures on chum salmon populations distributed along a longitudinal gradient.

Ultimately, statistical correlations between environmental variation and allele frequency 
differentiation do not constitute infallible evidence for natural selection (Schoville et al. 2012). 
To establish the cause and effect relationships between environmental variables and genetic 
variation, it is necessary to link genetic variation to phenotypic variation as well as gene 
functions to fitness differences (Barrett and Hoekstra 2011). This can be achieved by conducting 
direct measurements in the field or controlled laboratory experiments (Schoville et al. 2012). 
Moreover, functional SNPs that cause intraspecific phenotypic variation should help generate 
insights into the relationship between genotype and phenotype and hence the adaptive 
consequences of these SNPs (Macdonald and Long 2005). Further genetic sequencing and 
genome annotation will be required to identify functional genetic variation across genomes and to 
examine the adaptive functions of SNPs (Seeb et al. 2011; Macdonald and Long 2005).

As a concluding remark, the study of adaptive genetic variation in response to environmental 
variables may help predict how chum salmon populations will adapt to climate change. One 
prediction states that in the face of climate warming, gene migration from the pre-adapted 
populations in warmer climates will help to promote adaptation of the populations at the leading 
edge of the migrating front by bringing them better-adapted alleles (Davis and Shaw 2001). Since
populations at the trailing edge receive no gene flow from the pre-adapted populations, they are 
more likely to become locally extinct when facing the negative effects of the climate change 
(Davis and Shaw 2001). 